Optimize aligned TVList with lazy column allocation and runtime null bitmap. #17769

Open
luoluoyuyu wants to merge 2 commits into apache:master from luoluoyuyu:opt/aligned-tvlist-lazy-column-null

@luoluoyuyu
Member

Description

Allocate value arrays on write only, use per-chunk BitMaps for explicit nulls, and treat unmaterialized columns as null. Adapt query iterators, flush encoding, and memtable memory estimation for sparse aligned writes.
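
As a rough illustration of the idea (not the code in this PR), the sketch below models a single sparse column in plain Java. The class name `LazyColumnSketch`, the `CHUNK_SIZE` constant, and the use of `java.util.BitSet` in place of IoTDB's BitMap are all assumptions made for the example. Value chunks are allocated on first write, explicit nulls go into a per-chunk bitmap, and a chunk that was never materialized reads as null.

```java
import java.util.BitSet;

/** Hypothetical, simplified model of one sparse column; not the AlignedTVList API. */
class LazyColumnSketch {
  // Assumed chunk length, kept small for illustration.
  static final int CHUNK_SIZE = 4;

  // chunks[i] stays null until a value is written into chunk i.
  private final long[][] chunks;
  // Per-chunk marks for explicit nulls, also allocated lazily.
  private final BitSet[] nullMaps;

  LazyColumnSketch(int chunkCount) {
    chunks = new long[chunkCount][];
    nullMaps = new BitSet[chunkCount];
  }

  /** Writes a value, allocating the value chunk on first use. */
  void put(int row, long value) {
    int c = row / CHUNK_SIZE;
    if (chunks[c] == null) {
      chunks[c] = new long[CHUNK_SIZE];
    }
    chunks[c][row % CHUNK_SIZE] = value;
    if (nullMaps[c] != null) {
      nullMaps[c].clear(row % CHUNK_SIZE); // a real value overrides an earlier null mark
    }
  }

  /** Records an explicit null without allocating the value chunk. */
  void putNull(int row) {
    int c = row / CHUNK_SIZE;
    if (nullMaps[c] == null) {
      nullMaps[c] = new BitSet(CHUNK_SIZE);
    }
    nullMaps[c].set(row % CHUNK_SIZE);
  }

  /**
   * An unmaterialized chunk reads as null. Inside a materialized chunk only
   * explicitly marked rows are null, so this sketch assumes the caller calls
   * put or putNull for every row it appends.
   */
  boolean isNull(int row) {
    int c = row / CHUNK_SIZE;
    if (chunks[c] == null) {
      return true;
    }
    return nullMaps[c] != null && nullMaps[c].get(row % CHUNK_SIZE);
  }

  boolean isMaterialized(int chunk) {
    return chunks[chunk] != null;
  }
}
```

In this model a column that is never written costs only the two outer arrays, which is the kind of saving the description targets for sparse aligned writes.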


This PR has:

  • been self-reviewed.
    • concurrent read
    • concurrent write
    • concurrent read and write
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods.
  • added or updated version, license, or notice information
  • added comments explaining the "why" and the intent of the code wherever would not be obvious
    for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold
    for code coverage.
  • added integration tests.
  • been tested in a test IoTDB cluster.

Key changed/added classes (or packages if there are too many classes) in this PR

@Caideyipi
Collaborator

  1. AlignedTVList.java:1430

       size += ReadWriteIOUtils.sizeToWrite(getBinaryByValueIndex(rowIdx, columnIndex));

     serializedSize() still calls getBinaryByValueIndex() unconditionally for TEXT/BLOB/STRING/OBJECT columns. After this PR, a whole value array can be null to represent an unwritten sparse aligned column. In that case WAL size calculation can hit an NPE before serializeToWAL() gets to its valueArray == null handling.

  2. TsFileProcessor.java:817

       if ((alignedMemChunk.alignedListSize() % PrimitiveArrayManager.ARRAY_SIZE) == 0) {

     / AlignedWritableMemChunk.java:905

     For an existing aligned memchunk, when an insert expands the TVList into a new chunk, the memory estimate adds only alignedTvListArrayMemCost(). The extra value-array estimate checks only whether the current last chunk is unallocated. If the previous last chunk already had a value array, but the new chunk will allocate value arrays for this insert, those value arrays are not counted. This can underestimate memtable memory usage and weaken write memory control.
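
To make the two points concrete, here is a minimal sketch in plain Java. It is not the IoTDB code: the class `SparseSizeSketch`, both method names, their parameters, and the assumed byte layout (a 4-byte length prefix plus UTF-8 bytes) are hypothetical and exist only for illustration. The first method guards against a null value array before touching any element. The second counts a value array for every column the insert writes whenever the insert starts a new chunk, regardless of whether the previous last chunk was already allocated.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helpers illustrating the two review points; not the IoTDB API. */
class SparseSizeSketch {

  /** Size of one column chunk; a null array means the column was never written. */
  static int serializedSize(String[] valueArray, int rowCount) {
    if (valueArray == null) {
      // Assumption: an unwritten column contributes no value bytes in this model.
      return 0;
    }
    int size = 0;
    for (int i = 0; i < rowCount; i++) {
      String v = valueArray[i];
      // Assumed layout: 4-byte length prefix followed by the UTF-8 bytes.
      size += Integer.BYTES + (v == null ? 0 : v.getBytes(StandardCharsets.UTF_8).length);
    }
    return size;
  }

  /**
   * Estimated cost of the value arrays that one insert will allocate.
   * listSize is the row count before the insert; arraySize is the chunk length.
   */
  static long estimateValueArrayCost(
      int listSize,
      int arraySize,
      boolean[] writtenByInsert,
      boolean[] lastChunkAllocated,
      long costPerArray) {
    // The insert lands in a fresh chunk when the list is exactly full.
    boolean startsNewChunk = listSize % arraySize == 0;
    long cost = 0;
    for (int col = 0; col < writtenByInsert.length; col++) {
      if (!writtenByInsert[col]) {
        continue; // a column not written by this insert allocates nothing
      }
      // In a new chunk every written column needs a new array; otherwise
      // only columns whose last chunk is still unallocated do.
      if (startsNewChunk || !lastChunkAllocated[col]) {
        cost += costPerArray;
      }
    }
    return cost;
  }
}
```

The case described in point 2 corresponds to `startsNewChunk` being true while `lastChunkAllocated[col]` is also true: a check that looks only at the last chunk would return 0 there, whereas this sketch still counts the new array.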
